005.10 How Big is the Internet?
by Michael F. Schwartz
<schwartz@latour.cs.colorado.edu>
The question often arises, "How big is the Internet?" To answer this
question, we must first define what we wish to measure. At one time,
connectivity via the IP protocol suite defined the Internet. Since a
number of protocols now coexist on the Internet, some people have
suggested defining the Internet instead by a common name space (perhaps
the Domain Naming System or X.500). This definition is counterintuitive,
since it elides differences between various types of physical
connectivity. In particular, it does not distinguish the parts of the
network that can support interactive applications (like remote login) from
dialup-based, mail-only connections. Given the advantages of interactive
connectivity and the growing popularity of IP, in this article I consider
only the interconnected IP Internet.
M. Lottor recently published in RFC 1296 the
results of a ten-year study that counted
the number of hosts in domains that have IP addresses registered in the
DNS (as opposed to domains that register only "mail exchange" (MX) records,
which allow mail to be forwarded through an intermediary host).
In the early years the data were extracted from host tables
maintained by the DDN Network Information Center. Later, measurements
were taken by a program that recursively descends the Domain Naming tree,
retrieving information about all domains that allow "zone transfers".
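
The recursive descent described above can be sketched roughly as follows. This is an illustration only, not Lottor's actual program: the function name count_hosts, the zone_transfer callable, and the toy data are all hypothetical, and a real survey would issue DNS zone-transfer requests instead of reading a dictionary.

```python
def count_hosts(domain, zone_transfer):
    """Count hosts with address records in `domain` and everything below it.

    `zone_transfer` is a hypothetical callable: given a domain name, it
    returns (hosts, subdomains) for that zone, or None when the domain
    does not allow zone transfers.
    """
    result = zone_transfer(domain)
    if result is None:
        # Transfer refused: nothing below this point can be counted.
        return 0
    hosts, subdomains = result
    return len(hosts) + sum(count_hosts(sub, zone_transfer) for sub in subdomains)


# Toy data standing in for real zone transfers (all names are made up).
TOY_ZONES = {
    "example": (["a.example", "b.example"], ["cs.example", "ee.example"]),
    "cs.example": (["x.cs.example"], []),
    # "ee.example" is absent: it plays a domain that refuses zone transfers.
}

print(count_hosts("example", TOY_ZONES.get))  # 3
```

Domains that refuse transfers contribute nothing to the total, which is one reason a count obtained this way undercounts.
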
Many of the hosts counted by Lottor's study are hidden behind secure
gateways or otherwise not directly connected to the Internet. Therefore,
Lottor's study really indicates the spread of IP and the Domain Naming
System at sites connected to the Internet. A more meaningful
measure of Internet size is the number of domains at which common network
services can be contacted, since it is through such services that a site
gains the advantages of connectivity.
A study that tracks changes in service-level reachability in the Internet
is now underway.
While the measurements will not be complete until the end of 1992,
the first set of measurements that have been collected can be used to
characterize the current size of the interconnected IP Internet. The
final study will provide much more information than just Internet size.
It will indicate relative growth rates among different countries, trends
in the types of services to which sites limit access, how sites limit
access to these services, and the types and geographical distribution of
sites that distance themselves from the Internet.
Starting with a large list of domains, my study attempts to
connect to the following TCP/IP services at each domain:
__________________________________________________________________
Port Number   Service                 Port Number   Service
------------------------------------------------------------------
13            daytime                 111           Sun portmap
15            netstat                 513           rlogin
21            FTP                     514           rsh
23            telnet                  540           UUCP
25            SMTP                    543           klogin
53            Domain Naming System    544           krcmd, kshell
79            finger
__________________________________________________________________
This list was chosen to span a representative range of service types,
each of which can be expected to be found on any machine in a site (so
that probing random machines is meaningful). The one exception is the
Domain Naming System, for which the machines to probe are selected from
information obtained from the Domain system itself. Only TCP services
are tested, since the TCP connection mechanism allows one to determine,
in an application-independent fashion, whether a server is running.
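
A minimal sketch of such an application-independent probe is shown below. It is not the measurement software used in the study. The port numbers come from the table above; the function names, the timeout value, and the host name in the usage note are assumptions made for illustration.

```python
import socket

# Port numbers taken from the table of probed services.
SERVICES = {
    13: "daytime", 15: "netstat", 21: "FTP", 23: "telnet", 25: "SMTP",
    53: "Domain Naming System", 79: "finger", 111: "Sun portmap",
    513: "rlogin", 514: "rsh", 540: "UUCP", 543: "klogin",
    544: "krcmd, kshell",
}


def tcp_service_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) can be established.

    No application data is exchanged: a completed TCP handshake is taken
    as evidence that some server is listening on the port.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or the name did not resolve.
        return False


def probe_host(host, ports=tuple(SERVICES), timeout=5.0):
    """Return the names of the listed services that accepted a connection."""
    return [SERVICES[p] for p in sorted(ports)
            if tcp_service_reachable(host, p, timeout)]
```

Calling probe_host("host.example.edu") (a made-up host name) would return the list of services that answered. A domain would then count as reachable if at least one probed machine returns a non-empty list. Anyone adapting this should probe only hosts they are permitted to test.
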
From a list of approximately 12,700 Internet domains worldwide
(generated from Lottor's January 1991 data plus a number of other
sources), successful connections were recorded to at least one of the
above services in 4,455 domains, broken down by top-level domain as
follows:
__________________________________________________________________
Top-level      Description           Number of Domains Reachable
Domain Name                          by Measured Internet Services
------------------------------------------------------------------
edu            U.S. Educational      2048
com            U.S. Commercial        494
ca             Canadian               299
au             Australian             278
de             German                 174
se             Swedish                167
gov            U.S. Government        128
mil            U.S. Military          115
jp             Japanese               106
net            Named by network        96
nl             Dutch                   84
org            Non-profit              56
fr             French                  55
no             Norwegian               55
fi             Finnish                 45
uk             British                 44
it             Italian                 39
dk             Danish                  38
at             Austrian                21
nz             New Zealand             21
ch             Swiss                   20
il             Israeli                 16
is             Icelandic                8
es             Spanish                  8
kr             Korean                   5
be             Belgian                  4
gr             Greek                    4
za             South African            4
br             Brazilian                3
ie             Irish                    3
tw             Taiwanese                3
us             Other U.S.               3
arpa           ARPANET names            2
mx             Mexican                  2
sg             Singapore                2
hk             Hong Kong                1
in             Indian                   1
int            International            1
pt             Portuguese               1
tn             Tunisian                 1
------------------------------------------------------------------
This list is a lower bound, since it depends on the span of the
initial list of domains. Nonetheless, the measurements provide an
interesting point of comparison. For example, it is clear that the
number of USA sites is much larger than the number of sites in any
other country in the world. In fact, there are nearly twice as many
USA sites as sites in all other countries combined. However, given
the rapid growth rate of IP connectivity in other countries, within one
to two years there will be more sites internationally than in
the USA.
The distinction between service-level connectivity and IP host count
at Internet sites is underscored by the finding that 7,242
domains in Lottor's January 1991 list (out of 11,194 in that list) were
not reachable by any of the above Internet services. The ratio of
service-reachable domains to all IP domains may continue to decrease, as
security problems garner increasing concern. The results of the study
will help uncover the trend here.
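
As a quick worked check of the ratio mentioned above, using only the two figures quoted for Lottor's January 1991 list (the variable names are illustrative):

```python
total_domains = 11_194   # domains in Lottor's January 1991 list
unreachable = 7_242      # of those, not reachable by the probed services

reachable = total_domains - unreachable
print(reachable)                              # 3952
print(f"{reachable / total_domains:.1%}")     # 35.3%
print(f"{unreachable / total_domains:.1%}")   # 64.7%
```

Roughly one domain in three from that list answered on at least one probed service.
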
The services reached by my measurement software were as follows:
___________________________________
Service          Number of Domains
-----------------------------------
telnet           4170
FTP              4027
SMTP             3952
rlogin           3811
rsh              3777
finger           3637
daytime          3492
Sun portmap      3421
UUCP             2217
Domain           1803
netstat           294
klogin             95
krcmd, kshell      93
-----------------------------------
From this list it is clear that the "Big Three" applications
(remote login, file transfer, and mail) are the main services in use.
Interestingly, UUCP appears in more domains than DNS, even though
TCP-based UUCP (as opposed to dialup UUCP) is being phased out of
existence as NNTP gains popularity. The reason for this is probably
twofold. First, most domains contract DNS service from other domains,
to avoid the administrative effort required to run a Domain server.
Second, many computers probably come with UUCP configured in by the
manufacturer.
For additional information and metrics, other recent work is now available.
The size of the set of computer networks interconnected for at least
mail or news service, referred to as "The Matrix", is discussed by John
Quarterman in his book and newsletters of the same name. The diameter
of the interpersonal communication graph enabled by electronic mail is
discussed in the paper "Discovering Shared Interests Among People Using
Graph Analysis of Global Electronic Mail Traffic", prepared by Schwartz
and Wood at the University of Colorado Department of Computer Science.
Anyone who is considering performing measurement studies of the Internet
is urged to read Vint Cerf's "Guidelines for Internet Measurement
Activities" in RFC 1262, Oct. 1991.
* Assistant Professor, Dept of Computer Science, Univ. of Colorado
Boulder, Colorado, USA